GPU-based third-order low-rank tensor completion method and apparatus

ABSTRACT

The present disclosure provides a GPU-based third-order low-rank tensor completion method. Operation steps of the method include: (1) transmitting, by a CPU, input data DATA1 to a GPU, and initializing the loop count t=1; (2) obtaining, by the GPU, a third-order tensor Y_(t) of a current loop t based on the least squares method; (3) obtaining, by the GPU, a third-order tensor X_(t) of the current loop t based on the least squares method; (4) checking, by the CPU, whether an end condition is met; and if the end condition is met, turning to (5); otherwise, increasing the loop count t by 1 and turning to (2) to continue the loop; and (5) outputting, by the GPU, output data DATA2 to the CPU. In the present disclosure, computational tasks with high concurrency in third-order low-rank tensor completion are accelerated by using the GPU to improve computational efficiency.

CROSS REFERENCE

This disclosure is based upon and claims priority to Chinese Patent Application No. 201910195941.8, filed on Mar. 15, 2019, titled “GPU-based third-order low-rank tensor completion method”, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

The present disclosure relates generally to the field of high performance computing technologies, and more specifically, to a GPU (Graphics Processing Unit)-based third-order low-rank tensor completion method.

BACKGROUND

High-dimensional data in the real world can be naturally represented as tensors. Data loss often occurs in transmission over wireless sensor networks, and therefore the collected sensory data is often incomplete. In scenarios where computing and network resources are limited, people use partial measurements to reduce the amount of data that needs to be processed and transmitted, and this also leads to incomplete data. How to recover complete data from incomplete data has been a research hotspot in recent years. A common approach is to model the incomplete data as low-rank tensors and then perform recovery by exploiting redundant features in the data.

The present disclosure mainly focuses on data completion of third-order low-rank tensors. Existing research proposes some CPU-based third-order low-rank tensor completion methods. For example, the TNN-ADMM method needs to compute a singular value decomposition of a large block diagonal matrix in each iteration, and each iteration is performed simultaneously in the time domain and the frequency domain. Therefore, a large number of Fourier transforms and inverse Fourier transforms need to be performed, which makes the algorithm time consuming. The accuracy and speed of the Tubal-Alt-Min method, which is based on the alternating least squares method, are better than those of the TNN-ADMM method, but the computational efficiency of the Tubal-Alt-Min method is still not high. Generally, in CPU-based third-order low-rank tensor completion methods, the running time increases exponentially with the increase of tensor size. Therefore, CPU-based methods are not suitable for processing large-scale tensors.

A GPU has high parallelism and a high memory bandwidth, and is therefore widely applied to accelerate various calculations. The powerful computing power of the GPU provides a good foundation for accelerating completion of third-order low-rank tensor data.

SUMMARY

To address existing issues in the prior art, the objective of the present disclosure is to provide a GPU-based third-order low-rank tensor completion method for the low-tubal-rank tensor model.

The technical solution of the present disclosure is as follows:

A GPU-based third-order low-rank tensor completion method is provided, and operation steps include:

Step 1: transmitting, by a CPU, input data DATA1 to a GPU, and initializing the loop count t=1;

Step 2: obtaining, by the GPU, a third-order tensor Y_(t) of a current loop t based on the least squares method;

Step 3: obtaining, by the GPU, a third-order tensor X_(t) of the current loop t based on the least squares method;

Step 4: checking, by the CPU, whether an end condition is met; and if the end condition is met, turning to Step 5; otherwise, increasing the loop count t by 1 and turning to Step 2 to continue the loop; and

Step 5: outputting, by the GPU, output data DATA2 to the CPU.

Step 1 includes:

Step 1.1: allocating memory space in a GPU memory;

Step 1.2: transmitting the input data DATA1 in a CPU memory to the allocated memory space in the GPU memory, wherein the DATA1 includes the following data:

(1) A third-order to-be-completed tensor T∈R^(m×n×k), wherein R denotes the set of real numbers, m, n, and k are respectively the sizes of the tensor T in the first, second, and third dimensions, the total number of elements of the tensor is m×n×k, and the element whose indices in the first, second, and third dimensions are respectively i, j, and l is denoted as T_(i,j,l).

(2) An observation set S⊆[o]×[p]×[q], where o≤m, p≤n, and q≤k, and where [o] denotes the set {1, 2, . . . , o}, [p] denotes the set {1, 2, . . . , p}, and [q] denotes the set {1, 2, . . . , q}.

(3) An observation tensor TP∈R^(m×n×k) based on the observation set S and the to-be-completed tensor T∈R^(m×n×k), wherein TP is obtained by applying an observation function ObserveS( ) to T, that is, TP=ObserveS(T), and the observation function ObserveS( ) is defined as:

$\begin{matrix}{{TP}_{i,j,l} = \left\{ {\begin{matrix}{T_{i,j,l},} & {{{if}\mspace{14mu} \left( {i,j,l} \right)} \in S} \\{0,} & {otherwise}\end{matrix}.} \right.} & (1)\end{matrix}$

In the method of the present disclosure, the values of those elements TP_(i,j,l) in TP for which (i,j,l)∈S are accurate and do not need to be completed.

(4) A tubal-rank r of the tensor T, wherein T(i,j,:) is defined as the (i,j)^(th) tube of the tensor T∈R^(m×n×k), and is a vector (T(i,j,1), T(i,j,2), . . . , T(i,j,k)) with length k; the tubal-rank of the third-order tensor is defined as the number of non-zero tubes of θ in the singular value decomposition of the tensor T=U*θ*V, wherein * denotes the multiplication operation of third-order tensors, and is defined as performing tensor multiplication on third-order tensors A∈R^(n1×n2×k) and B∈R^(n2×n3×k) to obtain a third-order tensor C∈R^(n1×n3×k):

C(i,j,:)=Σ_(s=1)^(n2) A(i,s,:)*B(s,j,:), for i∈[n1] and j∈[n3], (2)

where [n1] denotes the set {1, 2, . . . , n1} and [n3] denotes the set {1, 2, . . . , n3}.

A(i,s,:)*B(s,j,:) denotes a circular convolution of the tubes/vectors A(i,s,:) and B(s,j,:).
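The tensor multiplication of formula (2) can be illustrated with a minimal sketch in plain Python. This is a sequential reference for the definition only, not the GPU implementation of the disclosure. The function names circ_conv and t_product are illustrative, tensors are assumed to be stored as nested lists indexed [i][j][l], and indices are 0-based rather than 1-based as in the text.

```python
def circ_conv(a, b):
    """Circular convolution of two tubes (lists) of equal length k."""
    k = len(a)
    return [sum(a[p] * b[(l - p) % k] for p in range(k)) for l in range(k)]


def t_product(A, B):
    """Formula (2): A is n1 x n2 x k, B is n2 x n3 x k, result is n1 x n3 x k."""
    n1, n2, n3, k = len(A), len(B), len(B[0]), len(A[0][0])
    C = [[[0.0] * k for _ in range(n3)] for _ in range(n1)]
    for i in range(n1):
        for j in range(n3):
            # C(i,j,:) is the sum over s of the circular convolutions
            # A(i,s,:) * B(s,j,:).
            for s in range(n2):
                tube = circ_conv(A[i][s], B[s][j])
                for l in range(k):
                    C[i][j][l] += tube[l]
    return C


# Toy example: a 1 x 2 x 2 tensor times a 2 x 1 x 2 tensor.
A = [[[1.0, 2.0], [3.0, 4.0]]]
B = [[[1.0, 0.0]], [[0.0, 1.0]]]
C = t_product(A, B)  # a 1 x 1 x 2 tensor
```

Every output tube C(i,j,:) depends only on row i of A and column j of B, so the n1×n3 tubes can be computed independently of one another.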

Step 1.3: initializing the loop variable t=1 on the CPU.

Step 2 includes:

obtaining, based on the least squares method on the GPU, a third-order tensor Y_(t) of the current loop whose loop count is recorded as t:

$\begin{matrix}{Y_{t} = {{\underset{Y_{t} \in R^{r \times n \times k}}{argmin}\left( {{Norm}\left( {{ObserveS}\left( {T - {X_{t - 1}*Y_{t}}} \right)} \right)} \right)}^{2}.}} & (3)\end{matrix}$

The operator * in X_(t-1)*Y_(t) in the formula (3) denotes the third-order tensor multiplication defined in the formula (2), and X_(t-1) is the tensor X_(t-1)∈R^(m×r×k) obtained after the end of the previous loop (that is, the (t−1)^(th) loop), and X₀∈R^(m×r×k) required by the first loop is initialized to an arbitrary value, such as a random tensor; the operator − in T−X_(t-1)*Y_(t) denotes tensor subtraction, that is, elements of the two tensors with the same index (i,j,l) are subtracted; ObserveS( ) is the observation function defined in the formula (1), and Norm( ) is a function for obtaining a tensor norm, and is defined as:

$\begin{matrix}{{{Norm}\left( {T \in R^{m \times n \times k}} \right)} = {\sqrt{\sum_{i = 1}^{m}{\sum_{j = 1}^{n}{\sum_{l = 1}^{k}T_{i,j,l}^{2}}}}.}} & (4)\end{matrix}$
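As a small illustration of the quantities in formulas (3) and (4), the following plain-Python sketch computes the tensor norm and the squared residual over the observation set. The names tensor_norm and masked_residual_sq are illustrative. The product P=X*Y is assumed to have been computed already, tensors are nested lists indexed [i][j][l], and S holds 0-based index triples.

```python
import math


def tensor_norm(T):
    """Formula (4): square root of the sum of squares of all elements."""
    return math.sqrt(sum(x * x for mat in T for tube in mat for x in tube))


def masked_residual_sq(T, P, S):
    """Norm(ObserveS(T - P))**2, summed directly over the observed set S.

    Elements outside S are zeroed by ObserveS and so contribute nothing.
    """
    return sum((T[i][j][l] - P[i][j][l]) ** 2 for (i, j, l) in S)


# Toy example: a 1 x 2 x 2 tensor, two observed entries, and a zero product P.
T = [[[1.0, 2.0], [3.0, 4.0]]]
P = [[[0.0, 0.0], [0.0, 0.0]]]
S = {(0, 0, 1), (0, 1, 0)}
err = masked_residual_sq(T, P, S)  # (2 - 0)**2 + (3 - 0)**2
```

Summing over S directly avoids materializing the intermediate tensor ObserveS(T−P) and gives the same value as the objective in formula (3).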

Step 3 includes:

obtaining, based on the least squares method on the GPU, a third-order tensor X_(t) of the current loop whose loop count is recorded as t:

$\begin{matrix}{X_{t} = {{\underset{X_{t} \in R^{m \times r \times k}}{argmin}\left( {{Norm}\left( {{ObserveS}\left( {T - {X_{t}*Y_{t}}} \right)} \right)} \right)}^{2}.}} & (5)\end{matrix}$

The operator * in X_(t)*Y_(t) in the formula (5) denotes the third-order tensor multiplication defined in the formula (2), Y_(t) is the tensor Y_(t)∈R^(r×n×k) obtained in Step 2, and the operator − in T−X_(t)*Y_(t) denotes tensor subtraction, that is, elements of the two tensors with the same index (i,j,l) are subtracted; ObserveS( ) is the observation function defined in the formula (1), and Norm( ) is the tensor norm defined in the formula (4).

Step 4 includes:

checking, by the CPU, whether the end condition is met; and if the end condition is met, turning to Step 5; otherwise, increasing the loop variable t by 1 and turning to Step 2 to continue the loop.

Step 5 includes:

transmitting, by the GPU, the output data DATA2 to the CPU, wherein the DATA2 includes third-order tensors X∈R^(m×r×k) and Y∈R^(r×n×k) obtained in the last loop.

Compared with the prior art, the present disclosure has the following prominent substantive features and significant technical progress.

In the present disclosure, computational tasks with high concurrency in third-order low-rank tensor completion are accelerated by using a GPU to improve computational efficiency. Compared with conventional CPU-based third-order low-rank tensor completion, computational efficiency is significantly improved, and the same calculation can be completed in less time.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of steps of a GPU-based third-order low-rank tensor completion method of the present disclosure.

FIG. 2 is a schematic diagram of a third-order tensor.

FIG. 3 is a schematic diagram illustrating another embodiment of this disclosure, wherein the apparatus comprises an input device, an output device, a CPU, a GPU, and a memory, all of which are communicably connected through a bus.

DETAILED DESCRIPTION

To make the objectives, technical solutions, and advantages of the present disclosure clearer, the following further describes the present disclosure in detail with reference to the accompanying drawings and the preferred embodiments. It should be understood that the specific embodiments described herein are merely intended for explaining the present disclosure, but not for limiting the present disclosure.

A third-order tensor is shown in FIG. 2. The first dimension of the tensor is also referred to as a row, and the size of the row is m; the second dimension is also referred to as a column, and the size of the column is n; the size of the third dimension is k. In this way, a real-valued tensor may be denoted as T∈R^(m×n×k), and a complex-valued tensor may be denoted as T∈C^(m×n×k), where C denotes the set of complex numbers. T(i,j,l) denotes the element whose indices in the first, second, and third dimensions of the tensor T are respectively i, j, and l. T(i,j,:) denotes the one-dimensional vector formed by the k elements T(i,j,1), T(i,j,2), . . . , T(i,j,k); this vector lies along the third dimension and is referred to as the (i,j)^(th) tube of the tensor. The tensor T∈R^(m×n×k) has m×n tubes.

Embodiment 1

A GPU-based third-order low-rank tensor completion method is provided. As shown in FIG. 1, steps are as follows:

Step 1: A CPU transmits input data DATA1 to a GPU, and initializes the loop count t=1.

Step 2: The GPU obtains a third-order tensor Y_(t) of a current loop t based on the least squares method.

Step 3: The GPU obtains a third-order tensor X_(t) of the current loop t based on the least squares method.

Step 4: The CPU checks whether an end condition is met; and if the end condition is met, turns to Step 5; otherwise, increases the loop count t by 1 and turns to Step 2 to continue the loop.

Step 5: The GPU outputs output data DATA2 to the CPU.
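The control flow of Steps 1 to 5 can be sketched in plain Python as follows. This is only an outline under stated assumptions: the two least-squares updates and the end condition are passed in as placeholder callables (update_y, update_x, and end_condition are hypothetical names), and no actual CPU-to-GPU transfer is performed. The toy usage replaces the tensors with scalars so that each least-squares update has a closed form.

```python
def complete(data1, x0, update_y, update_x, end_condition):
    """Outline of Steps 1 to 5 with the two solvers supplied by the caller."""
    # Step 1: a real implementation would copy data1 to GPU memory here.
    t = 1
    X = x0  # X_0 may be arbitrary, for example random
    while True:
        Y = update_y(data1, X)      # Step 2: Y_t from X_(t-1)
        X = update_x(data1, Y)      # Step 3: X_t from Y_t
        if end_condition(t, X, Y):  # Step 4: checked on the CPU
            break
        t += 1
    # Step 5: a real implementation would copy X and Y back to the CPU here.
    return X, Y, t


# Scalar toy problem: minimize (T - x*y)**2 with T = 6, so each update is a division.
X, Y, loops = complete(
    6.0,
    2.0,
    update_y=lambda T, x: T / x,
    update_x=lambda T, y: T / y,
    end_condition=lambda t, x, y: t >= 3,
)
```

Keeping the solvers as parameters separates the host-side loop (Steps 1, 4, and 5) from the device-side computation (Steps 2 and 3), mirroring the division of work between the CPU and the GPU described above.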

Embodiment 2: This embodiment is basically the same as Embodiment 1, and its special features are as follows

Step 1 includes the following steps:

Step 1.1: Memory space is allocated in a GPU memory.

Step 1.2: The input data DATA1 in a CPU memory is transmitted to the allocated memory space in the GPU memory. The DATA1 includes the following data:

(1) A third-order to-be-completed tensor T∈R^(m×n×k), wherein R denotes the set of real numbers, m, n, and k are respectively the sizes of the tensor T in the first, second, and third dimensions, the total number of elements of the tensor is m×n×k, and the element whose indices in the first, second, and third dimensions are respectively i, j, and l is denoted as T_(i,j,l).

(2) An observation set S⊆[o]×[p]×[q], wherein o≤m, p≤n, and q≤k.

(3) An observation tensor TP∈R^(m×n×k) based on the observation set S and the to-be-completed tensor T∈R^(m×n×k), wherein TP is obtained by applying an observation function ObserveS( ) to T, that is, TP=ObserveS(T), and the observation function ObserveS( ) is defined as:

$\begin{matrix}{{TP}_{i,j,l} = \left\{ {\begin{matrix}{T_{i,j,l},} & {{{if}\mspace{14mu} \left( {i,j,l} \right)} \in S} \\{0,} & {otherwise}\end{matrix}.} \right.} & (1)\end{matrix}$

In the method of the present disclosure, the values of those elements TP_(i,j,l) in TP for which (i,j,l)∈S are accurate and do not need to be completed.

(4) A tubal-rank r of the tensor T, wherein T(i,j,:) is defined as the (i,j)^(th) tube of the tensor T∈R^(m×n×k), and is a vector (T(i,j,1), T(i,j,2), . . . , T(i,j,k)) with length k; the tubal-rank of the third-order tensor is defined as the number of non-zero tubes of θ in the singular value decomposition of the tensor T=U*θ*V, wherein * denotes the multiplication operation of third-order tensors, and is defined as performing tensor multiplication on third-order tensors A∈R^(n1×n2×k) and B∈R^(n2×n3×k) to obtain a third-order tensor C∈R^(n1×n3×k):

C(i,j,:)=Σ_(s=1)^(n2) A(i,s,:)*B(s,j,:), for i∈[n1] and j∈[n3], (2)

where [n1] denotes the set {1, 2, . . . , n1} and [n3] denotes the set {1, 2, . . . , n3}.

A(i,s,:)*B(s,j,:) denotes performing a circular convolution of the tubes/vectors A(i,s,:) and B(s,j,:).
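The observation function of formula (1) can be illustrated with a short plain-Python sketch. The function name observe is illustrative, the tensor is assumed to be a nested list indexed [i][j][l], and the observation set S is assumed to be a Python set of 0-based index triples (the text uses 1-based indices).

```python
def observe(T, S):
    """Formula (1): keep T[i][j][l] where (i, j, l) is in S, zero elsewhere."""
    m, n, k = len(T), len(T[0]), len(T[0][0])
    return [[[T[i][j][l] if (i, j, l) in S else 0.0
              for l in range(k)]
             for j in range(n)]
            for i in range(m)]


# Toy example: a 1 x 2 x 2 tensor with two observed entries.
T = [[[1.0, 2.0], [3.0, 4.0]]]
S = {(0, 0, 1), (0, 1, 0)}
TP = observe(T, S)
```

Each output element depends only on its own index, so on a GPU this mapping could naturally be assigned one element per thread.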

Step 2 includes the following step:

A third-order tensor Y_(t) of the current loop whose loop count is recorded as t is obtained based on the least squares method on the GPU:

$\begin{matrix}{Y_{t} = {{\underset{Y_{t} \in R^{r \times n \times k}}{argmin}\left( {{Norm}\left( {{ObserveS}\left( {T - {X_{t - 1}*Y_{t}}} \right)} \right)} \right)}^{2}.}} & (3)\end{matrix}$

The operator * in X_(t-1)*Y_(t) in the formula (3) denotes the third-order tensor multiplication defined in the formula (2), and X_(t-1) is the tensor X_(t-1)∈R^(m×r×k) obtained after the end of the previous loop (that is, the (t−1)^(th) loop), and X₀∈R^(m×r×k) required by the first loop is initialized to an arbitrary value, such as a random tensor; the operator − in T−X_(t-1)*Y_(t) denotes tensor subtraction, that is, elements of the two tensors with the same index (i,j,l) are subtracted; ObserveS( ) is the observation function defined in the formula (1), and Norm( ) is a function for obtaining a tensor norm, and is defined as:

$\begin{matrix}{{{Norm}\left( {T \in R^{m \times n \times k}} \right)} = {\sqrt{\sum_{i = 1}^{m}{\sum_{j = 1}^{n}{\sum_{l = 1}^{k}T_{i,j,l}^{2}}}}.}} & (4)\end{matrix}$

Step 3 includes the following step:

A third-order tensor X_(t) of the current loop whose loop count is recorded as t is obtained based on the least squares method on the GPU:

$\begin{matrix}{X_{t} = {{\underset{X_{t} \in R^{m \times r \times k}}{argmin}\left( {{Norm}\left( {{ObserveS}\left( {T - {X_{t}*Y_{t}}} \right)} \right)} \right)}^{2}.}} & (5)\end{matrix}$

The operator * in X_(t)*Y_(t) in the formula (5) denotes the third-order tensor multiplication defined in the formula (2), Y_(t) is the tensor Y_(t)∈R^(r×n×k) obtained in Step 2, and the operator − in T−X_(t)*Y_(t) denotes tensor subtraction, that is, elements of the two tensors with the same index (i,j,l) are subtracted; ObserveS( ) is the observation function defined in the formula (1), and Norm( ) is the tensor norm defined in the formula (4).

Step 4 includes the following step:

The CPU checks whether the end condition is met; and if the end condition is met, turns to Step 5; otherwise, increases the loop variable t by 1 and turns to Step 2 to continue the loop. The end condition may be that a certain loop count or an error threshold is reached.
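One way to express such an end condition is sketched below in plain Python. The names make_end_condition, max_loops, tol, and error_fn are illustrative assumptions. The disclosure does not fix how the error is measured, so the error measure is supplied by the caller.

```python
def make_end_condition(max_loops, tol, error_fn):
    """Return a predicate that is true once either limit is reached."""
    def end_condition(t, X, Y):
        # Stop on the loop-count limit or when the error is small enough.
        return t >= max_loops or error_fn(X, Y) <= tol
    return end_condition


# Scalar toy usage: the error is |6 - x*y|.
cond = make_end_condition(10, 1e-6, lambda x, y: abs(6.0 - x * y))
```

If the error is evaluated on the GPU, only a single scalar needs to be returned to the CPU for this check in each loop.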

Step 5 includes the following step:

The GPU transmits the output data DATA2 to the CPU, wherein the DATA2 includes third-order tensors X∈R^(m×r×k) and Y∈R^(r×n×k) obtained in the last loop. A third-order tensor T′=X*Y after completion can be obtained by performing the tensor multiplication defined in the formula (2) for X and Y.
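By the convolution theorem, the circular convolutions in formula (2) become element-wise products after a discrete Fourier transform along the third dimension, so T′=X*Y can also be evaluated as k independent matrix products of frontal slices. The sketch below illustrates this equivalence in plain Python with a naive DFT. It is an assumption-laden illustration, not a statement of how the disclosure implements the product, and the names dft and t_product_fourier are illustrative.

```python
import cmath


def dft(v, inverse=False):
    """Naive discrete Fourier transform of a list (O(k^2), for illustration)."""
    k = len(v)
    sign = 1 if inverse else -1
    out = [sum(v[p] * cmath.exp(sign * 2j * cmath.pi * p * q / k)
               for p in range(k))
           for q in range(k)]
    return [z / k for z in out] if inverse else out


def t_product_fourier(X, Y):
    """T' = X*Y for X of size m x r x k and Y of size r x n x k."""
    m, r, n, k = len(X), len(Y), len(Y[0]), len(X[0][0])
    Xf = [[dft(X[i][s]) for s in range(r)] for i in range(m)]
    Yf = [[dft(Y[s][j]) for j in range(n)] for s in range(r)]
    T = [[None] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # For each frequency q: entry (i, j) of the matrix product
            # of the q-th frontal slices of Xf and Yf.
            tube_f = [sum(Xf[i][s][q] * Yf[s][j][q] for s in range(r))
                      for q in range(k)]
            T[i][j] = [z.real for z in dft(tube_f, inverse=True)]
    return T


X = [[[1.0, 2.0], [3.0, 4.0]]]    # 1 x 2 x 2
Y = [[[1.0, 0.0]], [[0.0, 1.0]]]  # 2 x 1 x 2
T_completed = t_product_fourier(X, Y)
```

The per-frequency matrix products are independent of one another, which is the kind of batched workload that suits a GPU.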

The disclosure further provides an apparatus, comprising: a CPU 100; a GPU 200 communicably connected with the CPU 100; and a memory 300 communicably connected with the CPU 100 and the GPU 200 for storing instructions executable by the CPU 100 and the GPU 200 to perform any of the abovementioned methods. Referring to FIG. 3, in one embodiment of this disclosure, the apparatus also comprises an input device 400 and an output device 500, and the CPU 100, the GPU 200, the memory 300, the input device 400, and the output device 500 are all communicably connected through a bus 600.

What is claimed is:
 1. A GPU-based third-order low-rank tensor completion method, wherein operation steps comprise: Step 1: transmitting, by a CPU, input data DATA1 to a GPU, and initializing the loop count t=1; Step 2: obtaining, by the GPU, a third-order tensor Y_(t) of a current loop t based on the least squares method; Step 3: obtaining, by the GPU, a third-order tensor X_(t) of the current loop t based on the least squares method; Step 4: checking, by the CPU, whether an end condition is met; and if the end condition is met, turning to Step 5; otherwise, increasing the loop count t by 1 and turning to Step 2 to continue the loop; and Step 5: outputting, by the GPU, output data DATA2 to the CPU.
 2. The GPU-based third-order low-rank tensor completion method according to claim 1, wherein Step 1 comprises: Step 1.1: allocating memory space in a GPU memory; Step 1.2: transmitting the input data DATA1 in a CPU memory to the allocated memory space in the GPU memory, wherein the DATA1 comprises the following data: (1) a third-order to-be-completed tensor T∈R^(m×n×k), wherein R denotes the set of real numbers, m, n, and k are respectively the sizes of the tensor T in the first, second, and third dimensions, the total number of elements of the tensor is m×n×k, and the element whose indices in the first, second, and third dimensions are respectively i, j, and l is denoted as T_(i,j,l); (2) an observation set S⊆[o]×[p]×[q], wherein o≤m, p≤n, and q≤k; (3) an observation tensor TP∈R^(m×n×k) based on the observation set S and the to-be-completed tensor T∈R^(m×n×k), wherein TP is obtained by applying an observation function ObserveS( ) to T, that is, TP=ObserveS(T), and the observation function ObserveS( ) is defined as: $\begin{matrix}{{TP}_{i,j,l} = \left\{ {\begin{matrix}{T_{i,j,l},} & {{{if}\mspace{14mu} \left( {i,j,l} \right)} \in S} \\{0,} & {otherwise}\end{matrix},} \right.} & (1)\end{matrix}$ wherein the values of those elements TP_(i,j,l) in TP for which (i,j,l)∈S need to be accurate and do not need to be completed; and (4) a tubal-rank r of the tensor T, wherein T(i,j,:) is defined as the (i,j)^(th) tube of the tensor T∈R^(m×n×k), and is a vector (T(i,j,1), T(i,j,2), . . . , T(i,j,k)) with length k; the tubal-rank of the third-order tensor is defined as the number of non-zero tubes of θ in the singular value decomposition of the tensor T=U*θ*V, wherein * denotes the multiplication operation of third-order tensors, and is defined as performing tensor multiplication on third-order tensors A∈R^(n1×n2×k) and B∈R^(n2×n3×k) to obtain a third-order tensor C∈R^(n1×n3×k): C(i,j,:)=Σ_(s=1)^(n2) A(i,s,:)*B(s,j,:), for i∈[n1] and j∈[n3], (2) wherein A(i,s,:)*B(s,j,:) denotes performing a circular convolution of the tubes/vectors; and Step 1.3: initializing the loop variable t=1 on the CPU.
 3. The GPU-based third-order low-rank tensor completion method according to claim 1, wherein Step 2 comprises: obtaining, based on the least squares method on the GPU, a third-order tensor Y_(t) of the current loop whose loop count is recorded as t: $\begin{matrix}{{Y_{t} = {\underset{Y_{t} \in R^{r \times n \times k}}{argmin}\left( {{Norm}\left( {{ObserveS}\left( {T - {X_{t - 1}*Y_{t}}} \right)} \right)} \right)}^{2}},} & (3)\end{matrix}$ wherein the operator * in X_(t-1)*Y_(t) in the formula (3) denotes the third-order tensor multiplication defined in the formula (2), and X_(t-1) is the tensor X_(t-1)∈R^(m×r×k) obtained after the end of the previous loop (that is, the (t−1)^(th) loop), and X₀∈R^(m×r×k) required by the first loop is initialized to an arbitrary value, such as a random tensor; the operator − in T−X_(t-1)*Y_(t) denotes tensor subtraction, that is, elements of the two tensors with the same index (i,j,l) are subtracted; ObserveS( ) is the observation function defined in the formula (1), and Norm( ) is a function for obtaining a tensor norm, and is defined as: $\begin{matrix}{{{Norm}\left( {T \in R^{m \times n \times k}} \right)} = {\sqrt{\sum_{i = 1}^{m}{\sum_{j = 1}^{n}{\sum_{l = 1}^{k}T_{i,j,l}^{2}}}}.}} & (4)\end{matrix}$
 4. The GPU-based third-order low-rank tensor completion method according to claim 1, wherein Step 3 comprises: obtaining, based on the least squares method on the GPU, a third-order tensor X_(t) of the current loop whose loop count is recorded as t: $\begin{matrix}{{X_{t} = {\underset{X_{t} \in R^{m \times r \times k}}{argmin}\left( {{Norm}\left( {{ObserveS}\left( {T - {X_{t}*Y_{t}}} \right)} \right)} \right)}^{2}},} & (5)\end{matrix}$ wherein the operator * in X_(t)*Y_(t) in the formula (5) denotes the third-order tensor multiplication defined in the formula (2), Y_(t) is the tensor Y_(t)∈R^(r×n×k) obtained in Step 2, and the operator − in T−X_(t)*Y_(t) denotes tensor subtraction, that is, elements of the two tensors with the same index (i,j,l) are subtracted; ObserveS( ) is the observation function defined in the formula (1), and Norm( ) is the tensor norm defined in the formula (4).
 5. The GPU-based third-order low-rank tensor completion method according to claim 1, wherein Step 4 comprises: checking, by the CPU, whether the end condition is met; and if the end condition is met, turning to Step 5; otherwise, increasing the loop variable t by 1 and turning to Step 2 to continue the loop.
 6. The GPU-based third-order low-rank tensor completion method according to claim 1, wherein Step 5 comprises: transmitting, by the GPU, the output data DATA2 to the CPU, wherein the DATA2 comprises third-order tensors X∈R^(m×r×k) and Y∈R^(r×n×k) obtained in the last loop.
 7. An apparatus comprising: a CPU; a GPU, communicably connected with the CPU; and a memory communicably connected with the CPU and the GPU for storing instructions executable by the CPU and the GPU, to perform the method according to claim 1.
 8. A method for GPU-based third-order low-rank tensor completion comprising the steps of: (1) transmitting, by a CPU, input data DATA1 to a GPU, and initializing the loop count t=1; (2) obtaining, by the GPU, a third-order tensor Y_(t) of a current loop t based on the least squares method; (3) obtaining, by the GPU, a third-order tensor X_(t) of the current loop t based on the least squares method; (4) checking, by the CPU, whether an end condition is met; and (5) if the end condition is met, outputting, by the GPU, output data DATA2 to the CPU.
 9. A method as recited in claim 8, further comprising the steps of: if the end condition is not met, increasing the loop count t by 1; and returning to step (2).